ECPR

Quantitative Text Analysis

Member rate £492.50
Non-Member rate £985.00

Save £45: Loyalty discount applied automatically*
Save 5% on each additional course booked

*If you attended our Methods School in July/August 2023 or February 2024.

Course Dates and Times

Monday 9 – Friday 13 August 2021
2 hours of live teaching per day
09:30 - 12:30 or 14:30 - 17:30 CEST

Kostas Gemenis

k.gemenis@cut.ac.cy

Cyprus University of Technology

This course provides a highly interactive online teaching and learning environment, using state of the art online pedagogical tools. It is designed for a demanding audience (researchers, professional analysts, advanced students) and capped at a maximum of 16 participants so that the teaching team can cater to the specific needs of each individual.

Purpose of the course

The course will introduce you to quantitative text analysis methods using examples from political science and related disciplines. It will look at manual content analysis but the emphasis will be on computer-assisted text analysis. 

We will cover methodological and practical aspects of the different methods. 

ECTS Credits

3 credits: Engage fully with class activities
4 credits: Complete a post-class assignment


Instructor Bio

Kostas Gemenis is Senior Researcher in Quantitative Methods at the Max Planck Institute for the Study of Societies.

His research interests include measurement in the social sciences, and content analysis with applications to estimating the policy positions of political actors.

He is currently involved in Preference Matcher, a consortium of researchers who collaborate in developing e-literacy tools designed to enhance voter education.

@k_gemenis

Monday

We begin with some key concepts in content analysis and continue with the basics of coding text manually. Topics will include best practices for defining a coding scheme, selecting the appropriate documents, coding the documents, estimating inter-coder reliability using Krippendorff's alpha, scaling the coded data, and the possibilities offered by crowdcoding. The aim is to give you all the elements for designing a manual content analysis project. From Tuesday to Friday, we will focus on computer-assisted text analysis.
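
To make the inter-coder reliability step more concrete, here is a minimal sketch of Krippendorff's alpha for nominal codes. The course exercises use R; Python is used here purely for illustration. The function name and the toy data are hypothetical and are not taken from the course materials.

```python
from collections import Counter
from itertools import permutations


def krippendorff_alpha_nominal(units):
    """Krippendorff's alpha for nominal data.

    `units` is a list of sequences; each sequence holds the codes that the
    coders assigned to one document (missing codes are simply left out).
    """
    # Coincidence matrix: every ordered pair of codes within a unit
    # contributes 1 / (m - 1), where m is the number of codes in that unit.
    coincidences = Counter()
    for codes in units:
        m = len(codes)
        if m < 2:
            continue  # a unit coded only once cannot show (dis)agreement
        for a, b in permutations(codes, 2):
            coincidences[(a, b)] += 1 / (m - 1)
    if not coincidences:
        raise ValueError("no unit was coded by at least two coders")

    # Marginal totals n_c and the overall number of pairable codes n.
    totals = Counter()
    for (c, _), weight in coincidences.items():
        totals[c] += weight
    n = sum(totals.values())

    # Observed and expected disagreement (both scaled by n, which cancels).
    observed = sum(w for (c, k), w in coincidences.items() if c != k)
    expected = sum(
        totals[c] * totals[k] for c in totals for k in totals if c != k
    ) / (n - 1)
    if expected == 0:
        raise ValueError("alpha is undefined when only one category is used")
    return 1 - observed / expected


# Toy example: two coders, ten documents, binary codes (illustrative data).
coder_a = [0, 1, 0, 0, 0, 0, 0, 0, 1, 0]
coder_b = [1, 1, 1, 0, 0, 1, 0, 0, 0, 0]
alpha = krippendorff_alpha_nominal(list(zip(coder_a, coder_b)))
print(round(alpha, 3))  # 0.095
```

The two coders agree on six of the ten documents, yet alpha is close to zero because most of that agreement would be expected by chance given how often code 0 is used.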

Tuesday

After an introduction on document pre-processing and some basic rules for good practice, this session will cover the construction and validation of dictionaries, and their use in sentiment analysis.
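
As a rough illustration of dictionary-based sentiment analysis, the sketch below lowercases and tokenises a text, counts matches against two word lists, and returns a net sentiment score. The word lists, function names, and scoring rule are assumptions chosen for brevity; a real dictionary would need to be constructed and validated as discussed in the session. The course itself works in R; Python is used here only for illustration.

```python
import re

# Hypothetical toy dictionary; real dictionaries are far larger and validated.
POSITIVE = {"good", "growth", "success", "improve"}
NEGATIVE = {"bad", "crisis", "failure", "decline"}


def tokenize(text):
    """Very basic pre-processing: lowercase and keep alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def sentiment_score(text):
    """Net sentiment in [-1, 1]: (positive - negative) / (positive + negative)."""
    tokens = tokenize(text)
    pos = sum(token in POSITIVE for token in tokens)
    neg = sum(token in NEGATIVE for token in tokens)
    if pos + neg == 0:
        return 0.0  # no dictionary word found; treated as neutral here
    return (pos - neg) / (pos + neg)


print(sentiment_score("Growth is good, but the crisis was bad and a failure."))
# -0.2
```

Note that this simple scheme ignores negation ("not good") and word sense, which is one reason why dictionary validation matters.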

Wednesday

We focus on scaling methods in text analysis, covering supervised methods such as Wordscores, and unsupervised methods, such as Wordfish. Much emphasis will be placed on different ways to validate the output of scaling methods. 
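
The core idea behind Wordscores can be sketched in a few lines: words are scored from reference texts with known positions, and a new text is scored as the frequency-weighted mean of its word scores. The sketch below is a simplified illustration in Python (the course uses R). It omits the rescaling of text scores and the uncertainty estimates, and all function names and toy texts are hypothetical.

```python
from collections import Counter


def relative_frequencies(tokens, vocabulary=None):
    """Relative word frequencies, optionally restricted to a vocabulary."""
    counts = Counter(t for t in tokens if vocabulary is None or t in vocabulary)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def word_scores(reference_texts, reference_positions):
    """Score each word as the position of the reference texts, weighted by
    the probability of reading that text given the word."""
    freqs = {name: relative_frequencies(tokens)
             for name, tokens in reference_texts.items()}
    vocabulary = set().union(*freqs.values())
    scores = {}
    for word in vocabulary:
        total = sum(f.get(word, 0.0) for f in freqs.values())
        scores[word] = sum(
            f.get(word, 0.0) / total * reference_positions[name]
            for name, f in freqs.items()
        )
    return scores


def score_text(tokens, scores):
    """Score a new text using only the words that have a word score."""
    freqs = relative_frequencies(tokens, vocabulary=scores)
    return sum(freq * scores[word] for word, freq in freqs.items())


# Toy reference texts with assumed positions -1 (left) and +1 (right).
references = {
    "left": ["welfare", "welfare", "tax", "state"],
    "right": ["market", "market", "tax", "state"],
}
positions = {"left": -1.0, "right": 1.0}
scores = word_scores(references, positions)

new_text = ["welfare", "tax", "market", "market", "unknown"]
print(score_text(new_text, scores))  # 0.25
```

Words that appear equally often in both reference texts ("tax", "state") receive a score of zero and pull the text score towards the middle, which is why the raw scores of new texts tend to cluster and are usually rescaled in practice.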

Thursday

We continue with supervised classification methods, discussing different algorithms and the evaluation metrics used in the machine learning literature.
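
To illustrate what such evaluation metrics measure, the sketch below computes precision, recall, and F1 for a binary classifier by comparing predicted labels with hand-coded labels. The function name and the toy labels are hypothetical, and Python is used only for illustration, since the course exercises are in R.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Precision, recall and F1 for the class labelled `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)

    # Convention used here: a metric with a zero denominator is set to 0.0.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy example: hand-coded labels versus classifier output for six documents.
hand_coded = [1, 1, 1, 0, 0, 0]
predicted = [1, 1, 0, 1, 0, 0]
precision, recall, f1 = precision_recall_f1(hand_coded, predicted)
print(round(precision, 3), round(recall, 3), round(f1, 3))  # 0.667 0.667 0.667
```

Precision asks how many of the documents assigned to the class really belong to it, while recall asks how many of the documents that belong to the class were found; F1 is their harmonic mean.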

Friday

In our last session, we discuss unsupervised classification methods and topic models in particular. Again, the focus will be on practical issues as well as the question of validating the classification output. Finally, we will compare the different methods and discuss trade-offs in quantitative text analysis.


How the course will work online

The course will be taught using a combination of lectures and seminars featuring ‘live’ and independent elements. 

Each day will have 60–90 minutes of pre-recorded lectures in addition to a set of course readings. 

There will also be an hour on Zoom for semi-structured discussion and Q&A with the Instructor on the topics of the day. 

You will have access to a 100-page e-book with annotated exercises illustrating the different text analysis methods in R statistical software. You can work on these on your own or take advantage of the daily Zoom time, when you can ask the Instructor questions, discuss the exercises, troubleshoot, and so on. 

You are also welcome to arrange one-to-one video-chat appointments with the Instructor, to discuss practical aspects of your own project.

You should be familiar with basic statistical concepts such as

  • measures of central tendency (mean, median)
  • dispersion (standard deviation)
  • tests of association (Pearson’s r)
  • inference (χ², t-test).

This material is covered in the first few chapters of introductory statistics or data analysis textbooks. A useful example is Pollock P.H. III, The Essentials of Political Analysis, fourth edition (Washington, DC: CQ Press, 2012), Chapters 2, 3, 6, and 7.

Some familiarity with the R statistical software is desirable but not required. We will use RStudio for all the seminar exercises.